(54) Voice reference apparatus, recording medium recording voice reference control program
and voice recognition navigation apparatus 



(57) A voice reference apparatus that classifies a 
plurality of search targets into a plurality of division 
blocks, searches for a search target by first specifying 
a division block and then specifying the search target 
and enables specification of, at least, the search target 
to be made by voice, includes: a first storage device in 
which recognition data related to search targets corresponding
to individual division blocks are stored; a second storage device
in which division block-related information indicating one or
more other division blocks related to a given division block
through a specific relationship is stored; a recognition data
selection device
that selects recognition data corresponding to only a 
certain division block and one or more other division 
blocks related to the certain division block specified by 
the division block-related information from the first stor- 
age device, when the certain division block has been 
specified; and a voice recognition processing device 
that performs voice recognition based upon voice rec- 
ognition data generated by using the recognition data 
selected by the recognition data selection device and 
audio data corresponding to the search target specified 
by voice. 
Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The present invention relates to a voice refer- 
ence system and a voice recognition navigation appa- 
ratus using the voice reference system. 

2. Description of the Related Art 

[0002] Car navigation apparatuses (hereafter referred to as
navigation apparatuses) in the prior art display the current
position of the vehicle, display a map over a wide area or in
detail, and provide guidance to the driver along the traveling
direction over the remaining distance to the destination. There
are also voice recognition navigation apparatuses in the prior
art that enable the driver engaged in driving to issue operating
instructions by voice, thereby improving driver safety (see
Japanese Laid-Open Patent Publication No. 09-292255, for instance).

[0003] FIGS. 11A - 11D illustrate the concept of voice
recognition dictionaries (hereafter simply referred to as 
dictionaries) used in a navigation apparatus in the prior 
art to display a desired ski resort in a map through voice 
instructions. 

[0004] When the power to the navigation apparatus is turned on,
the basic dictionary shown in FIG. 11A is opened in the memory.
In the basic dictionary, instruction phrases such as "bird's eye
view display," "enlarge," "reduce" and "ski resorts" are stored
as recognition words. If the user says (vocalizes) "ski resorts"
to specify a facility category, voice recognition processing is
performed on all the recognition words in the basic dictionary.
When "ski resorts" is recognized as the result of the voice
recognition processing, a ski resort prefecture name dictionary,
which contains the names of prefectures where ski resorts are
present as recognition words, is opened in memory, as shown in
FIG. 11B.
[0005] Then, if the user says "ABCD Prefecture," for instance, to
specify the prefecture where the desired ski resort is present,
voice recognition processing is performed on all the recognition
words in the prefecture name dictionary. If "ABCD Prefecture" is
recognized as the result of the voice recognition processing, an
ABCD Prefecture ski resort name dictionary containing the names
of ski resorts present in ABCD Prefecture as recognition words is
opened in memory as shown in FIG. 11C. Next, the user says "B Ski
Resort" to specify a ski resort, and in response, voice
recognition processing is performed on all the recognition words
in the ABCD Prefecture ski resort name dictionary. After "B Ski
Resort" is recognized through the voice recognition processing,
a map containing B Ski Resort is displayed on the screen of the
navigation apparatus as shown in FIG. 11D.



[0006] In addition to ski resorts, there are various facility
categories that need to be recognized by the voice recognition
software program, such as theme parks and airports. Many such
facilities are located near prefectural borders. For instance,
there is a ski resort located near the prefectural border of
Gunma Prefecture and Niigata Prefecture, a theme park located
near the prefectural border of Tokyo Prefecture and Chiba
Prefecture and an airport located near the prefectural border of
Osaka Prefecture and Hyogo Prefecture. In addition, in the case
of a vast facility such as a golf course or a ski resort, the
user may not be certain which prefecture the facility belongs to.

[0007] If the user inputs the wrong prefecture when specifying
the prefecture where the facility is located in such a situation,
the facility name dictionary of the wrong prefecture, where the
facility is not located, is opened in memory and accessed. Thus,
a problem occurs in that a successful recognition is not achieved
no matter how many times the user subsequently says the correct
facility name.
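
The failure described in paragraph [0007] can be illustrated with a minimal sketch. The dictionary contents and the function name `prior_art_candidates` are hypothetical and chosen only for illustration; the point is that a facility in the adjacent prefecture never enters the candidate set, so no amount of repetition can make it match.

```python
# Hypothetical facility name dictionaries, one per prefecture (illustrative data).
SKI_RESORT_DICTIONARIES = {
    "ABCD Prefecture": ["A Ski Resort", "B Ski Resort"],
    "EFGH Prefecture": ["F Ski Resort", "G Ski Resort"],
}


def prior_art_candidates(prefecture):
    """Return only the recognition words of the prefecture that was spoken."""
    return list(SKI_RESORT_DICTIONARIES.get(prefecture, []))


# The user wrongly believes the resort is in ABCD Prefecture.
candidates = prior_art_candidates("ABCD Prefecture")
found = "F Ski Resort" in candidates  # the correct name is not a candidate
```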

SUMMARY OF THE INVENTION 

[0008] An object of the present invention is to provide a voice
reference apparatus capable of performing a search for a
reference target through voice recognition quickly, efficiently
and accurately with a high degree of reliability and a recording
medium that records a control program used in the process. More
specifically, the object of the present invention is to provide a
voice recognition navigation apparatus capable of achieving
accurate voice recognition of the names of facilities located
near the borders of public administrative zones (districts).

[0009] In order to attain the above object, a voice reference
apparatus according to the present invention that classifies a
plurality of search targets into a plurality of division blocks,
searches for a search target by first specifying a division block
and then specifying the search target and enables specification
of, at least, the search target to be made by voice, comprises: a
first storage device in which recognition data related to search
targets corresponding to individual division blocks are stored; a
second storage device in which division block-related information
indicating one or more other division blocks related to a given
division block through a specific relationship is stored; a
recognition data selection device that selects recognition data
corresponding to only a certain division block and one or more
other division blocks related to the certain division block
specified by the division block-related information from the
first storage device, when the certain division block has been
specified; and a voice recognition processing device that
performs voice recognition based upon voice recognition data
generated by using the recognition data selected by the
recognition data selection device and audio data corresponding to
the search target specified by voice.
[0010] In this voice reference apparatus, it is preferred that:
the plurality of division blocks are public administrative zones;
the search target is located in one of the public administrative
zones; and the division block-related information indicates one
or more other public administrative zones related to a specified
public administrative zone through a specific relationship. In
this case, it is preferred that: the public administrative zones
are each constituted of a prefecture, a state or a country. Also,
it is preferred that the division block-related information
indicates one or more other public administrative zones adjacent
to a specified public administrative zone. In this case, it is
preferred that the recognition data related to the search target
includes information related to a public administrative zone in
which the search target is located. Furthermore, it is preferred
that a display control device that implements control to display
details related to results of a search of the search target on a
display device is further provided, and when implementing control
to display the details related to the results of the search of
the search target, the display control device also displays on
the display device information related to the public
administrative zone in which the search target is located.

[0011] A voice recognition navigation apparatus ac- 
cording to the present invention, comprises: a voice ref- 
erence apparatus; a map information storage device 
that stores map information; and a control device that 
implements control for providing route guidance based 
upon, at least, results of a search performed by the voice 
reference apparatus and the map information. And the 
voice reference apparatus, which classifies a plurality of
search targets into a plurality of division blocks, search- 
es for a search target by first specifying a division block 
and then specifying the search target and enables spec- 
ification of, at least, the search target to be made by 
voice, comprises: a first storage device in which recog- 
nition data related to search targets corresponding to 
individual division blocks are stored; a second storage 
device in which division block-related information indi- 
cating one or more other division blocks related to a giv- 
en division block through a specific relationship is 
stored; a recognition data selection device that selects 
recognition data corresponding to only a certain division 
block and one or more other division blocks related to 
the certain division block specified by the division block-related
information from the first storage device, when
the certain division block has been specified; and a 
voice recognition processing device that performs voice 
recognition based upon voice recognition data generat- 
ed by using the recognition data selected by the recog- 
nition data selection device and audio data correspond- 
ing to the search target specified by voice. 
[0012] A voice reference control program according to the present
invention for searching for a search target specified by voice,
by first specifying a division block and then specifying the
search target, comprises: an instruction for reading recognition
data related to search targets, a plurality of the search targets
being classified into a plurality of division blocks; an
instruction for reading data related to division block-related
information indicating one or more other division blocks related
to a given block through a specific relationship; an instruction
for selecting recognition data corresponding to only a certain
division block and one or more other division blocks related to
the certain division block specified by the division
block-related information when the certain division block has
been specified; and an instruction for implementing a voice
recognition based upon voice recognition data generated by using
the recognition data that have been selected and audio data
corresponding to the search target specified by voice.

[0013] A recording medium according to the present 
invention records the above voice reference control pro- 
gram. 

[0014] A data signal according to the present invention comprises
the above voice reference control program and is transmitted in a
communication line.

BRIEF DESCRIPTION OF THE DRAWINGS 



[0015]

FIG. 1 illustrates the structure assumed by the car navigation
system according to the present invention in a first embodiment;

FIGS. 2A - 2C show recognition dictionaries related to the ski
resort category among recognition dictionaries used in the first
embodiment;

FIG. 3 shows a neighboring prefecture table;

FIGS. 4A and 4B present an example of how neighboring prefectures
may be assigned for each prefecture;

FIG. 5 is a flowchart of the control implemented to reference the
name of a facility in a given prefecture;

FIG. 6 is a flowchart continuing from the flowchart in FIG. 5;

FIG. 7 is a flowchart continuing from the flowchart in FIG. 6;

FIGS. 8A - 8C show recognition dictionaries related to the ski
resort category among recognition dictionaries used in a second
embodiment, presenting an example in which an area is divided in
units of individual states;

FIG. 9 presents a neighboring state table;

FIG. 10 illustrates how the program may be provided via a
transmission medium; and

FIGS. 11A - 11D illustrate the concept of the voice recognition
dictionaries used in a navigation apparatus in the prior art to
display a map containing a desired ski resort through voice
instruction.
DESCRIPTION OF THE PREFERRED
EMBODIMENTS 

- First Embodiment - 

[0016] FIG. 1 shows the structure adopted by the car 
navigation system in the first embodiment of the present 
invention. The car navigation system comprises a nav- 
igation apparatus 100 and a voice unit 200. 
[0017] The navigation apparatus 100 comprises a 
GPS receiver 101, a gyro sensor 102, a vehicle speed
sensor 103, a driver 104, a CPU 105, a RAM 106, a
ROM 107, a CD-ROM drive 108, a display device 109,
a bus line 110 and the like. 

[0018] The voice unit 200 comprises a microphone 
201, an A/D conversion unit 202, a D/A conversion unit
203, an amplifier 204, a speaker 205, a TALK switch
206, a driver 207, a CPU 208, a RAM 209, a ROM 210,
a bus line 212 and the like. The navigation apparatus
100 and the voice unit 200 are connected with each other
via a communication line 211.

[0019] The GPS receiver 101 receives a signal from 
a GPS (Global Positioning System) satellite and detects 
the absolute position and the absolute bearing of the ve- 
hicle. The gyro sensor 102, which may be constituted 
of, for instance, a vibrating gyro, detects the yaw angle 
speed of the vehicle. The vehicle speed sensor 103 detects
the distance traveled by the vehicle based upon
the number of pulses output each time the vehicle has 
traveled over a specific distance. The two dimensional 
movement of the vehicle is detected by the gyro sensor
102 and the vehicle speed sensor 103. The driver 104
is provided to connect signals from the GPS receiver
101, the gyro sensor 102 and the vehicle speed sensor
103 with the bus line 110. In other words, the outputs
from the individual sensors are converted to data that 
can be read by the CPU 105. 
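
As a rough illustration of how a yaw angle speed and a pulse count can be combined into two dimensional movement, the following sketch advances a position by one sampling interval. The function name, the parameter `metres_per_pulse` and the simple update rule are assumptions made for illustration; the source does not specify the actual computation.

```python
import math


def dead_reckon(x, y, heading, pulses, metres_per_pulse, yaw_rate, dt):
    """Advance a 2D pose by one sampling interval (illustrative model only).

    heading is in radians, yaw_rate in radians per second, dt in seconds.
    """
    # Distance travelled: each pulse is assumed to stand for a fixed distance.
    distance = pulses * metres_per_pulse
    # Heading change: yaw angle speed integrated over the interval.
    heading = heading + yaw_rate * dt
    # Move along the updated heading.
    x = x + distance * math.cos(heading)
    y = y + distance * math.sin(heading)
    return x, y, heading


straight = dead_reckon(0.0, 0.0, 0.0, 10, 0.4, 0.0, 1.0)
turned = dead_reckon(0.0, 0.0, 0.0, 10, 0.4, math.pi / 2, 1.0)
```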

[0020] The CPU 105 controls the entire navigation apparatus 100
by executing a program stored in the ROM 107. In the RAM 106,
which is constituted of volatile memory, a work data area is
secured. In the ROM 107, which is constituted of nonvolatile
memory, the control program mentioned above and the like are
stored. The CD-ROM drive 108 uses a CD-ROM 111 as a recording
medium to store road map information such as vector road data and
the like. The CD-ROM drive may alternatively be constituted of
another recording device such as a DVD drive which uses a DVD as
a recording medium. The display device 109 displays a road map
that contains the current position and the surrounding area of
the vehicle, route information indicating the route to the
destination, the intersection information indicating the next
intersection to which the vehicle is to be guided and the like.
It may be constituted of, for instance, a liquid crystal display
device or a CRT. The bus line 110 is provided to connect the
components of the navigation apparatus 100 such as the CPU 105
via a bus.
[0021] The voice unit 200 performs voice-related processing such
as voice recognition and voice synthesis. The TALK switch 206 is
pressed by the user to give an instruction for a start of voice
recognition. Audio data are input via the microphone 201 over a
specific period of time after the TALK switch 206 is pressed. The
sound thus input is converted to digital audio data by the A/D
conversion unit 202 and the driver 207.
[0022] In the ROM 210 of the voice unit 200, a voice recognition
software program, a voice synthesis software program, voice
recognition dictionaries (hereafter simply referred to as
recognition dictionaries), a voice synthesis dictionary
(hereafter simply referred to as a synthesis dictionary) and the
like are stored. In the voice recognition software program,
correlation values between the digital audio data and all the
recognition words in a recognition dictionary are calculated and
the recognition word achieving the largest correlation value is
determined to be the recognition result. In the voice synthesis
program, data needed to output a specified phrase through the
speaker are calculated. Since both software programs are of the
known art, their detailed explanation is omitted.
[0023] A recognition dictionary is constituted of a set of data
compiled with a plurality of words and phrases to be used in
voice recognition. More specifically, pronunciation data
corresponding to individual words specified with Hiragana,
Katakana, Roman characters or phonetic symbols (the corresponding
character codes, in reality) are stored in the recognition
dictionary. The words and phrases stored in the recognition
dictionary are referred to as recognition words. In addition to
the pronunciation data, the character data corresponding to the
recognition word and, if the recognition word represents a
facility name, information such as the corresponding coordinate
information are attached to each recognition word. Details of the
recognition dictionaries are to be given later. In the synthesis
dictionary, sound source data and the like necessary for voice
synthesis are stored.
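
One way to picture a recognition word and its attached data is as a small record. The class name `RecognitionWord`, its field names and the sample coordinate values below are hypothetical; only the three kinds of attached data (pronunciation data, character data, optional coordinate information) come from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RecognitionWord:
    """Illustrative record for one entry of a recognition dictionary."""

    pronunciation: str  # pronunciation data, e.g. kana or Roman characters
    display_text: str   # character data corresponding to the word
    # Coordinate information is attached only when the word is a facility name.
    coordinates: Optional[Tuple[float, float]] = None


# An instruction phrase carries no coordinates.
enlarge = RecognitionWord(pronunciation="enlarge", display_text="enlarge")

# A facility name carries coordinates (the values here are made up).
b_resort = RecognitionWord(
    pronunciation="b ski resort",
    display_text="B Ski Resort",
    coordinates=(36.8, 138.9),
)
```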
[0024] When a speech is completed, the CPU 208 executes the voice
recognition software program by using the RAM 209, the ROM 210
and the like to perform a voice recognition of the digital audio
data. The voice recognition software program references the
pronunciation data (data specified in Hiragana, Katakana or Roman
characters) of the recognition words in the recognition
dictionary to generate the voice recognition data corresponding
to the recognition words and calculates the correlation values
between the voice recognition data and the digital audio data. It
calculates the correlation values between all the recognition
words and the digital audio data and determines the recognition
word achieving the highest correlation value which is also equal
to or larger than a specific value before ending the voice
recognition. The echo-back word linked to the recognition word is
then converted to special speech data by using the voice
synthesis software program. Then, the CPU 208 engages the D/A
conversion unit 203, the amplifier 204 and the speaker 205 to
output the recognition results through echo-back.

[0025] If all the correlation values thus calculated are 
equal to or smaller than the specific value, the CPU 208 
decides that voice recognition has failed and thus no 
navigation operation is executed. More specifically, it 
may sound a beep indicating that a voice recognition 
attempt has failed or it may sound a response echo-back 
such as "recognition failed." The bus line 212 is provided
for the voice unit 200. 
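
The selection rule in paragraphs [0024] and [0025] (take the recognition word with the largest correlation value, and accept it only if that value reaches a specific value) can be sketched as follows. This is not the recognition program itself: the `correlation` function substitutes a text similarity from Python's `difflib` for the acoustic correlation, and the threshold of 0.8 is an arbitrary assumption.

```python
from difflib import SequenceMatcher

BASIC_DICTIONARY = ["bird's eye view display", "enlarge", "reduce", "ski resorts"]


def correlation(audio_text, recognition_word):
    """Stand-in score in [0, 1]; a real system compares acoustic data."""
    return SequenceMatcher(None, audio_text.lower(), recognition_word.lower()).ratio()


def recognize(audio_text, dictionary, threshold=0.8):
    """Return the best-scoring word, or None when recognition fails."""
    best_word = None
    best_score = float("-inf")
    for word in dictionary:  # every recognition word is scored
        score = correlation(audio_text, word)
        if score > best_score:
            best_word, best_score = word, score
    if best_word is None or best_score < threshold:
        return None  # caller would echo back "recognition failed"
    return best_word
```

Returning `None` rather than the weak best match mirrors the behavior in which no navigation operation is executed after a failed attempt.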

[0026] Next, a detailed explanation is given on the 
recognition dictionaries. The recognition dictionaries in- 
clude a basic dictionary containing recognition words re- 
lated to instructions, prefecture name dictionaries con- 
taining recognition words related to prefecture names 
corresponding to various categories and prefecture fa- 
cility name dictionaries each containing recognition 
words related to the names of facilities in a given prefecture
in a given category. The prefecture names used
to specify prefectures, too, should be regarded as a type 
of instruction phrase. 

[0027] FIGS. 2A — 2C show recognition dictionaries 
related to the ski resort category among the recognition 
dictionaries. The basic dictionary shown in FIG. 2A is a 
dictionary commonly used among various categories, 
and contains recognition words related to instructions 
such as "bird's eye view display," "enlarge," "reduce"
and "ski resorts." In the ski resort prefecture name dic- 
tionary shown in FIG. 2B, recognition words related to 
the names of prefectures where ski resorts are present 
are stored. In the ABCD Prefecture ski resort name dic- 
tionary shown in FIG. 2C, recognition words related to 
the names of ski resorts located in ABCD Prefecture are
stored, whereas in the EFGH Prefecture ski resort name 
dictionary in FIG. 2C, recognition words related to the 
names of ski resorts present in EFGH Prefecture are 
stored. In addition to the ABCD Prefecture ski resort 
name dictionary and the EFGH Prefecture ski resort 
name dictionary in FIG. 2C, ski resort dictionaries cor- 
responding to the individual prefectures listed in the ski 
resort prefecture name dictionary in FIG. 2B are provid- 
ed. 

[0028] A recognition word is constituted of pronunciation data
for a given phrase. It is specified in hiragana, katakana, Roman
characters, pronunciation symbols or the like, and the
corresponding character code or the like is stored as the
recognition word. However, the items in FIGS. 2A - 2C are
expressed using Kanji and the like to facilitate the explanation.

[0029] It is to be noted that the names of ski resorts 
in the entire country are stored in a hierarchical structure 
in units of individual prefectures for the following reason. 
Let us assume that a single ski resort name dictionary, 
in which the names of all the ski resorts in the country 
are stored, is provided without the ski resort prefecture 
name dictionary in FIG. 2B. In this case, for each ski 
resort name to be recognized through voice recognition, 
all the ski resort names in the recognition dictionary 
must undergo the voice recognition processing and a
great deal of time will be required for the processing. In
addition, since the number of items to undergo recognition
processing is large, the chance of erroneous recognition
rises. Furthermore, the entire recognition dictionary
may not be opened in the memory at once due to
limits imposed on the work memory capacity. Thus, the
names of ski resorts in the country are stored in the hi- 
erarchical structure in units of individual prefectures and 
are processed as described above. 

[0030] For the golf course category, a golf course prefecture
name dictionary and golf course name dictionaries corresponding
to the individual prefectures are prepared (not shown). The
dictionaries related to other categories such as theme parks are
prepared in a similar manner. In other words, as recognition
dictionaries, the basic dictionary, prefecture dictionaries in
various categories and facility name dictionaries corresponding
to the individual prefectures in each category are prepared.
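
The three levels of recognition dictionaries can be sketched as nested lookups. All names and entries below (`BASIC`, `PREFECTURES_BY_CATEGORY`, `FACILITIES`, `open_dictionary`, and the golf course names) are hypothetical placeholders, and the layout is only one possible arrangement.

```python
# Level 1: basic dictionary shared by all categories.
BASIC = ["bird's eye view display", "enlarge", "reduce", "ski resorts", "golf courses"]

# Level 2: one prefecture name dictionary per category.
PREFECTURES_BY_CATEGORY = {
    "ski resorts": ["ABCD Prefecture", "EFGH Prefecture"],
    "golf courses": ["ABCD Prefecture"],
}

# Level 3: one facility name dictionary per (category, prefecture) pair.
FACILITIES = {
    ("ski resorts", "ABCD Prefecture"): ["A Ski Resort", "B Ski Resort"],
    ("ski resorts", "EFGH Prefecture"): ["F Ski Resort", "G Ski Resort"],
    ("golf courses", "ABCD Prefecture"): ["P Golf Course"],
}


def open_dictionary(category=None, prefecture=None):
    """Return the recognition words for the current stage of the dialogue."""
    if category is None:
        return BASIC
    if prefecture is None:
        return PREFECTURES_BY_CATEGORY[category]
    return FACILITIES[(category, prefecture)]
```

Only one small dictionary is scored at each stage, which is the motivation given in paragraph [0029] for the hierarchical structure.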

[0031] In this embodiment, a neighboring prefecture table is
stored in the ROM 210 in addition to the recognition
dictionaries. FIG. 3 presents the neighboring prefecture table.
A neighboring prefecture table 301 contains neighboring
prefecture information for each of the 47 prefectures in the
country (in the case of Japan). Neighboring prefecture
information 302 for each prefecture includes data indicating a
prefecture code 303 which represents the target prefecture
itself, the number of neighboring prefectures 304 and neighboring
prefecture codes 305.
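
A record of the neighboring prefecture table might be modelled as below. The class name `NeighborInfo`, the helper `build_table` and the numeric codes are hypothetical; the three fields correspond to the prefecture code, the number of neighboring prefectures and the neighboring prefecture codes named above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class NeighborInfo:
    """Illustrative record of the neighboring prefecture information."""

    prefecture_code: int              # code of the target prefecture itself
    neighbor_codes: Tuple[int, ...]   # codes of the neighboring prefectures

    @property
    def neighbor_count(self) -> int:
        # Derived here; a ROM layout would store it explicitly so that the
        # variable-length list of codes can be read back.
        return len(self.neighbor_codes)


def build_table(entries: Iterable[NeighborInfo]) -> Dict[int, NeighborInfo]:
    """Index the records by prefecture code for direct lookup."""
    return {entry.prefecture_code: entry for entry in entries}


# Made-up codes: prefecture 10 borders 9, 11 and 15; prefecture 47 has none.
table = build_table([NeighborInfo(10, (9, 11, 15)), NeighborInfo(47, ())])
```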

[0032] Any of various methods may be adopted to assign
neighboring prefectures. For instance, all the prefectures
geographically adjacent to a given prefecture at its prefectural
border may be assigned, prefectures that are considered to be
nearby may be assigned, prefectures which should be regarded as
neighboring prefectures as dictated by experience may be assigned
or prefectures located along an expressway passing through the
prefecture may be assigned as neighboring prefectures. FIGS. 4A
and 4B present an example of neighboring prefectures assigned for
the individual prefectures in Japan.

[0033] In the embodiment, if a given prefecture name is specified
by voice when searching for a facility in a given category, the
neighboring prefecture table described above is employed and the
facility name dictionary corresponding to a neighboring
prefecture of the specified prefecture, too, is opened in memory.
As a result, when searching for a facility located near the
prefectural border, it can be found with ease even if a
neighboring prefecture is specified by mistake.
[0034] FIGS. 5 - 7 present a flowchart of the control implemented
on the voice unit 200 to search for the name of a facility
located in a given prefecture. Now, an explanation is given on a
specific example in which ABCD Prefecture is erroneously
specified when searching for F Ski Resort located in EFGH
Prefecture adjacent to ABCD Prefecture. The control program,
which is stored in the ROM 210, is executed by the CPU 208. The
routine is started up by turning on the power to the navigation
apparatus 100 and the voice unit 200.
[0035] In step S1, the basic dictionary shown in FIG.
2A stored in the ROM 210 is read out and opened in the 
RAM 209. The basic dictionary in the ROM 210 is 
opened in the RAM 209 to increase the processing 
speed. If the processing speed is not a crucial issue, the 
dictionary in the ROM 210 may be accessed directly. In 
step S2, a decision is made as to whether or not the 
TALK switch 206 has been pressed, and if it is decided 
that the TALK switch 206 has been pressed, the opera- 
tion proceeds to step S3. If, on the other hand, it is de- 
cided that the TALK switch 206 has not been pressed, 
the routine ends. After pressing the TALK switch 206, 
the user says (vocalizes), for instance, "ski resorts" with- 
in a specific period of time. In step S3, the audio signal 
obtained through the microphone 201 is converted to digital
audio data. In step S4, a decision is made as to
whether or not the speech has ended. A speech is
judged to have ended when the audio signal has been absent
over a specific length of time. If it is decided that the speech
has ended, the operation proceeds to step S5, whereas 
if it is decided that the speech has not ended, the oper- 
ation returns to step S3. In this example, digital audio 
data corresponding to "ski resorts" are obtained in step 
S3. 
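
The end-of-speech decision in step S4 can be sketched as a check on the most recent audio frames. The function name `speech_ended`, the use of per-frame signal levels and both parameters are assumptions for illustration; the source states only that the speech is judged to have ended once the signal has lapsed for a specific length of time.

```python
def speech_ended(frame_levels, silence_level, min_silent_frames):
    """Return True when the trailing frames have all stayed below silence_level.

    frame_levels is the list of signal levels captured so far, oldest first.
    """
    if min_silent_frames < 1:
        raise ValueError("min_silent_frames must be at least 1")
    if len(frame_levels) < min_silent_frames:
        return False  # not enough audio captured yet to decide
    recent = frame_levels[-min_silent_frames:]
    return all(level < silence_level for level in recent)
```

In a capture loop corresponding to steps S3 and S4, a new frame would be appended and this check repeated until it returns `True`.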

[0036] In step S5, the correlation values between the 
digital audio data that have been obtained and all the 
recognition words in the basic dictionary are calculated 
before the operation proceeds to step S6. Namely, the 
correlation values between the digital audio data corre- 
sponding to "ski resorts" obtained in step S3 and the 
recognition words such as "bird's eye view display," "en- 
large," "reduce," "ski resorts" and "golf courses" are cal- 
culated. In step S6, a decision is made as to whether or 
not the largest correlation value among the calculated 
correlation values is equal to or larger than a specific 
value. If it is determined to be equal to or larger than the 
specific value, it is assumed that the word or phrase has 
been recognized and the operation proceeds to step S7. 
In this example, the correlation value relative to the rec- 
ognition word "ski resorts" is the largest. If the correla- 
tion value is equal to or larger than the specific value, it 
is decided that the phrase "ski resorts" has been recog- 
nized and a successful search of the category name has 
been achieved. In step S7, a voice message constituted
of the recognition word that has achieved the largest
correlation value and "say the prefecture name" is out- 
put. In the example, a message "ski resorts," "say the 
prefecture name" is echoed back by voice. In addition, 
the prefecture name dictionary in the relevant category 
is prepared in the RAM 209 in step S7. In the example, 
the "ski resort prefecture name dictionary" (see FIG. 2B) 
is opened in the RAM 209. 

[0037] If, on the other hand, the largest correlation value is
determined to be smaller than the specific value in step S6, it
is assumed that the spoken word or phrase has not been recognized
and the operation proceeds to step S8. In step S8, a voice
message "recognition failed" is echoed back before the processing
ends. The navigation apparatus 100 does not engage in any
processing.

[0038] When the processing in step S7 is completed, the operation
proceeds to step S9. In step S9, the audio signal obtained
through the microphone 201 is converted to digital audio data as
in step S3. In step S10, a decision is made as to whether or not
the speech has ended as in step S4. During this interval, the
user says "ABCD Prefecture." By repeating steps S9 and S10, the
digital audio data corresponding to "ABCD Prefecture" are
obtained in the example.

[0039] In step S11, the correlation values between the digital
audio data thus obtained and all the recognition words in the ski
resort prefecture name dictionary are calculated before the
operation proceeds to step S12. Namely, the correlation values
between the digital audio data corresponding to "ABCD Prefecture"
obtained in step S9 and the recognition words such as "Hokkaido,"
"Aomori Prefecture," "ABCD Prefecture," "EFGH Prefecture" and
"Okinawa Prefecture" are calculated. In step S12, a decision is
made as to whether or not the largest correlation value among the
calculated correlation values is equal to or larger than a
specific value. If it is decided to be equal to or larger than
the specific value, it is concluded that the word has been
recognized and the operation proceeds to step S13. In the
example, the correlation value relative to the recognition word
"ABCD Prefecture" is the largest. If this correlation value is
equal to or larger than the specific value, the phrase "ABCD
Prefecture" has been recognized and the ski resort prefecture
name has been successfully referenced. In step S13, a voice
message constituted of the recognition word that has achieved the
largest correlation value and "say the facility name" is output.
In the example, "ABCD Prefecture. Say the facility name" is
echoed back.

[0040] In addition, the facility name dictionary for the target
prefecture and the facility name dictionary for a neighboring
prefecture are opened in the RAM 209 in step S13. Since the name
of the target prefecture has been obtained in step S12, the
neighboring prefecture table (see FIG. 3) stored in the ROM 210
is accessed to obtain the neighboring prefecture information for
the target prefecture. Based upon the neighboring prefecture
information, the facility name dictionary corresponding to a
neighboring prefecture is opened in the RAM 209. As a result, the
target prefecture facility name dictionary and a neighboring
prefecture facility name dictionary are incorporated and are
prepared in the RAM 209 as if they constitute a single target
prefecture facility name dictionary. In the example, in which
EFGH Prefecture is a neighboring prefecture of ABCD Prefecture,
"ABCD Prefecture ski resort name dictionary" and "EFGH ski resort
name dictionary" are incorporated and prepared in the RAM 209.
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[0041 ] It is to be noted that if the ROM 2 1 0 is accessed 
directly instead of opening the prefecture facility name 
dictionaries in the ROM 210 in the RAM 209. the target 
prefecture facility name dictionary and the neighboring 
prefecture facility name dictionary alone may be ac- 
cessed sequentially. 
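
The dictionary preparation performed in step S13 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the control program of the apparatus: the names `NEIGHBOR_TABLE`, `FACILITY_DICTIONARIES` and `open_dictionaries`, as well as the sample entry "A Ski Resort", are hypothetical stand-ins for the neighboring prefecture table in the ROM 210, the prefecture facility name dictionaries and the work area in the RAM 209.

```python
# Hypothetical stand-in for the neighboring prefecture table (FIG. 3).
NEIGHBOR_TABLE = {
    "ABCD Prefecture": ["EFGH Prefecture"],
    "EFGH Prefecture": ["ABCD Prefecture"],
}

# Hypothetical stand-in for the per-prefecture facility name dictionaries.
# "A Ski Resort" is an invented sample entry.
FACILITY_DICTIONARIES = {
    "ABCD Prefecture": ["A Ski Resort"],
    "EFGH Prefecture": ["F Ski Resort"],
}


def open_dictionaries(target):
    """Return (recognition word, prefecture) pairs for the target
    prefecture followed by those of its neighboring prefectures."""
    merged = []
    for prefecture in [target] + NEIGHBOR_TABLE.get(target, []):
        for word in FACILITY_DICTIONARIES.get(prefecture, []):
            merged.append((word, prefecture))
    return merged


# Dictionaries prepared after "ABCD Prefecture" has been recognized.
work_dictionary = open_dictionaries("ABCD Prefecture")
```

Only the dictionaries of the specified prefecture and its neighbors are merged, which keeps the work area small compared with loading the facility names for the whole country.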

[0042] If, on the other hand, it is decided in step S12
that the largest correlation value is smaller than the specific
value, it is decided that the spoken word or phrase
has not been recognized and the operation proceeds to
step S14. In step S14, a voice message "recognition
failed" is echoed back and the processing ends. The
navigation apparatus 100 does not engage in any
processing.

[0043] After the processing in step S13 is completed,
the operation proceeds to step S15. In step S15, the audio
signal obtained through the microphone 201 is converted
to digital audio data as in step S3. In step S16, a
decision is made as to whether or not the speech has ended
as in step S4. The user says, for instance, "F Ski
Resort" during this interval. While the F Ski Resort is
actually located in EFGH Prefecture, the user erroneously
believes that the F Ski Resort is in ABCD Prefecture,
since it is located near the prefectural border of
ABCD Prefecture and EFGH Prefecture. By repeating
step S15 and step S16, the digital audio data corresponding
to "F Ski Resort" are obtained.
[0044] In step S17, the correlation values between the
digital audio data that have been obtained and all the 
recognition words in the facility name dictionaries pre- 
pared in the RAM 209 are calculated, and the operation 
proceeds to step S18. As explained earlier, the facility 
name dictionary corresponding to the target prefecture 
and the facility name dictionary corresponding to the 
neighboring prefecture are prepared in the RAM 209, 
and the correlation values relative to all the recognition 
words in these dictionaries are calculated. In the exam- 
ple, correlation values between the digital audio data 
corresponding to "F Ski Resort" obtained in step S15 
and all the recognition words representing the ski resort 
names in the "ABCD Prefecture ski resort name diction- 
ary" and the "EFGH Prefecture ski resort name diction- 
ary" are calculated. 

[0045] In step S18, a decision is made as to whether
or not the largest correlation value among the calculated
correlation values is equal to or larger than a specific
value. If it is decided to be equal to or larger than the
specific value, it is concluded that the word or phrase
has been recognized and the operation proceeds to step
S19. In the example, the correlation value relative to the
recognition word "F Ski Resort" in the EFGH Prefecture
ski resort name dictionary is the largest. If this correlation
value is equal to or larger than the specific value,
the phrase "F Ski Resort" has been recognized and a
successful search of the facility name has been
achieved. In step S19, the recognition word "F Ski Resort"
achieving the largest correlation value is echoed
back.
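
The decision made in steps S17 through S19 can be sketched as follows. This is a minimal Python illustration that assumes the correlation values have already been calculated by the voice recognition processing; the function name `recognize`, the threshold of 0.8 and the sample scores are hypothetical and are not taken from the specification.

```python
def recognize(correlations, threshold):
    """Return the recognition word with the largest correlation value,
    or None when that value is smaller than the threshold."""
    if not correlations:
        return None
    best_word = max(correlations, key=correlations.get)
    if correlations[best_word] >= threshold:
        return best_word
    return None


# Hypothetical correlation values for the utterance "F Ski Resort",
# calculated against the merged ABCD and EFGH dictionaries.
scores = {"A Ski Resort": 0.41, "F Ski Resort": 0.93}

result = recognize(scores, threshold=0.8)
# Echo back the recognized word, or the failure message otherwise.
message = result if result is not None else "recognition failed"
```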



[0046] In addition, in step S19, the navigation apparatus
100 is notified that a valid facility name has been
recognized before the processing ends. When the navigation
apparatus 100 is notified, the coordinates of the
facility on the map are also provided to the navigation
apparatus 100. Additional information constituted of coordinate
data indicating the coordinates of the corresponding
facility is also stored in the recognition dictionary
in correspondence to each recognition word. The
navigation apparatus 100 displays a road map of the area
around the facility on the display device 109 based
upon the coordinate data indicating the coordinates of
the facility on the map transmitted via the communication
line 211.

[0047] If, on the other hand, the largest correlation
value is determined to be smaller than the specific value
in step S18, it is assumed that the spoken word has not
been recognized and the operation proceeds to step
S20. In step S20, "recognition failed" is echoed back by
voice, before ending the processing. The navigation apparatus
100 does not engage in any processing, either.
[0048] As described above, even if the user erroneously
specifies a neighboring prefecture when searching
for a facility located in a given prefecture, the facility
can be referenced in a reliable manner. In the example
given above, even if the user erroneously specifies the
neighboring "ABCD Prefecture" when searching for "F
Ski Resort" located in "EFGH Prefecture," "F Ski Resort"
can be referenced with a high degree of reliability. In addition,
since it is not necessary to provide recognition
words for the names of all the facilities in the country in
the work memory, the target facility can be searched efficiently,
quickly, accurately and reliably while requiring
only a small work memory capacity.

[0049] It is to be noted that while an explanation is
given above on an example in which "F Ski Resort" is
located only in "EFGH Prefecture," there may be another
ski resort also called "F Ski Resort" located in "ABCD
Prefecture" by coincidence. In such a case, two correlation
values achieving equally high levels will be referenced.
These search results will be provided to the navigation
apparatus 100, and in response, the following
display will be brought up on the display device 109. It
goes without saying that voice output may be concurrently
performed at the voice unit 200.

"1: F Ski Resort (ABCD Prefecture) or
2: F Ski Resort (EFGH Prefecture)?"

[0050] The user inputs by voice the number he wishes
to choose or inputs the number he wishes to choose
through an input device (not shown) such as a remote
control for the navigation apparatus. As a result, even
when facilities having identical names are present in
neighboring prefectures, the target facility can be selected
with ease. It is desirable to attach information related
to the name of the prefecture in which a given facility is
located to each recognition word in the facility recognition
dictionary. In such a case, since the name of the
prefecture in which the facility is located can be displayed
with ease in the selection screen described
above, the user can make a selection without becoming
confused. It goes without saying that the name of the
prefecture may be ascertained and displayed by using
the prefecture facility name dictionary containing the
recognition word. It is to be noted that facilities with highly
similar names located in neighboring prefectures,
e.g., "F Ski Resort" located in "EFGH Prefecture" and "S
Ski Resort" located in "ABCD Prefecture," may be handled
in a similar manner.
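
The handling of identically or similarly named facilities can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names `candidates_at_top` and `selection_prompt`, the `tolerance` parameter and the sample correlation values are hypothetical. It assumes each recognition word carries the name of its prefecture, as recommended above.

```python
def candidates_at_top(scored, threshold, tolerance=0.0):
    """Keep the candidates whose correlation value reaches the threshold
    and lies within `tolerance` of the largest value.

    scored: list of (recognition word, prefecture, correlation value).
    """
    passing = [c for c in scored if c[2] >= threshold]
    if not passing:
        return []
    best = max(c[2] for c in passing)
    return [c for c in passing if best - c[2] <= tolerance]


def selection_prompt(candidates):
    """Build the numbered selection text shown on the display device."""
    lines = [
        f"{i}: {word} ({prefecture})"
        for i, (word, prefecture, _) in enumerate(candidates, start=1)
    ]
    return " or\n".join(lines) + "?"


# Hypothetical correlation values: two resorts share the same name.
scored = [
    ("F Ski Resort", "ABCD Prefecture", 0.9),
    ("F Ski Resort", "EFGH Prefecture", 0.9),
    ("A Ski Resort", "ABCD Prefecture", 0.3),
]
top = candidates_at_top(scored, threshold=0.8)
prompt = selection_prompt(top)
```

A nonzero `tolerance` would let highly similar names, such as "F Ski Resort" and "S Ski Resort," appear in the same selection screen; the appropriate value is a design choice the specification does not fix.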

- Second Embodiment - 

[0051] An explanation has been given in reference to 
the first embodiment on an example in which the area 
is divided in units of individual prefectures in Japan. The 
dividing units of the area may be individual states in the 
USA, instead of the prefectures in Japan. 
[0052] FIGS. 8A ~ 8C show recognition dictionaries
related to the ski resort category among recognition dictionaries,
presenting an example in which the area is
divided in units of individual states. They correspond to
FIGS. 2A ~ 2C illustrating the first embodiment. In the
ski resort state name dictionary shown in FIG. 8B, recognition
words corresponding to the names of states in
which ski resorts are present are stored. In FIG. 8C, the
ABCD State ski resort name dictionary contains recog- 
nition words corresponding to the names of ski resorts 
present in ABCD State and the EFGH State ski resort 
name dictionary contains recognition words corre- 
sponding to the names of ski resorts present in EFGH 
State. Ski resort name dictionaries corresponding to all 
the states listed in the ski resort state name dictionary 
in FIG. 8B are provided in addition to the ABCD State
ski resort name dictionary and the EFGH State ski resort
name dictionary in FIG. 8C.

[0053] In a recognition dictionary, spelling and the 
voice recognition data (e.g., phonetic symbols (pronun- 
ciation symbols)) of recognition words to undergo voice 
recognition processing are stored. Also, as in the first 
embodiment, information such as coordinate informa- 
tion is attached in the case of facility names. 
[0054] FIG. 9 shows a neighboring state table. It corresponds
to FIG. 3 illustrating the first embodiment and
is similar to FIG. 3 except that the prefectures in FIG.
3 are replaced by the states. The assignment of neighboring
states, too, may be made in a manner similar to
the manner in which neighboring prefectures are assigned
in the first embodiment.

[0055] Processing similar to that performed in the first 
embodiment is implemented by using the ski resort state 
name dictionary, the individual state ski resort name dic- 
tionary and the neighboring state table described above. 
Consequently, even if a neighboring state is specified 
by mistake when searching for a facility present in a given
state, the target facility can be referenced with a high
degree of reliability.

[0056] In the explanation given above, the area is divided
in units of individual states in the United States.
The present invention, however, may be adopted in conjunction
with an area divided in units of public administrative
zones used in other countries. In other words,
recognition dictionaries can be prepared in correspondence
to zones (e.g., states, prefectures, districts and
countries) resulting from the divisions made in conformance
to the particulars of zone boundaries in the individual
countries. In addition, if there are numerous small
countries, as in Europe, the area may be divided in units
of individual countries, as well.

[0057] While an explanation is given above in reference
to the embodiments on an example in which the
present invention is adopted in a car navigation system,
the present invention is not limited to this example. It
may be adopted in portable navigation apparatuses instead
of navigation apparatuses mounted in vehicles. In
addition, it may be adopted in a guide system installed
in a building as well. In short, it may be adopted in all
types of systems or apparatuses on which a search target
among a plurality of search targets present in a plurality
of divided zones is specified by voice.

[0058] While an explanation is given in reference to
the first embodiment on an example in which the area
is divided in units of individual prefectures, the present
invention is not limited to this example, and the area may
be divided in units of smaller municipalities or in units of
individual regions such as the Kanto Region, the Tokai
Region and the Kinki Region. In addition, it may be divided
in units of individual floors or individual specific
ranges on a given floor in the case of a guide system
installed in a building. Furthermore, the search blocks
do not need to represent geographical divisions, either.
For instance, if the basic dictionary in FIG. 2A contains
a recognition word "Restaurants," the dictionary which
is equivalent to the dictionary shown in FIG. 2B may contain
recognition words indicating different types of restaurants
such as "French cuisine," "Chinese cuisine"
and "Japanese cuisine," and the dictionaries that are
equivalent to those in FIG. 2C may each contain the
names of restaurants specializing in each cuisine. Also,
the present invention may be adopted effectively when
different types of "accommodations" are classified as
"business hotels," "hotels" and "Japanese-style inns."
In such a case, by assigning "business hotels" and "hotels"
to classification categories that are similar to each
other, a search can be performed with "business hotels"
added as a search target when "hotels" is specified.
Thus, even if "hotels" is erroneously specified to search
for "ABC Hotel" which is classified as a business hotel,
a successful search is achieved.
[0059] In addition, while an explanation is given in reference
to the embodiments on an example in which a
ski resort located in a given public administrative zone
is referenced, the present invention is not limited to these
particulars. Any targets, including street names, airport names and
theme parks can be referenced. In other words, a search
target may assume any form and its classification block,
too, may assume any form in correspondence to the attributes
of the search target.

[0060] While an explanation is given above in reference
to the embodiments on a structure achieved by
providing the navigation apparatus 100 and the voice
unit 200 as separate units, the present invention is not
limited to these particulars and may be adopted in an
integrated navigation apparatus having an internal voice
unit. In addition, the control program, the recognition dictionaries,
the neighboring prefecture table and the like
explained above may be provided in a recording medium
such as the CD-ROM 111. Furthermore, the control
program, the recognition dictionaries and the like may
be provided in a recording medium such as a CD-ROM
111 and the system described above may be realized
on a computer such as a personal computer or a workstation.

[0061] Alternately, the control program, the recognition
dictionaries, the neighboring prefecture dictionaries
and the like may be provided via a transmission medium
such as a communication line, a typical example of
which is the Internet. In other words, the control program
and the like may be converted to a signal that is transmitted
through a transmission medium. FIG. 10 illustrates
how this may be realized. A navigation apparatus
401 is the navigation apparatus explained earlier and
has a function of connecting with a communication line
402. A computer 403 is a server computer in which the
control program and the like are stored so that the control
program and the like can be provided to the navigation
apparatus 401.
The communication line 402 may be a communication
line for Internet communication or personal computer
communication, or it may be a dedicated communication
line. The communication line 402 may be a telephone
line or a wireless telephone line such as that for
a mobile telephone connection.

[0062] While an explanation is given above in reference
to the embodiments on an example in which when
a successful search of a facility name is achieved in the
voice unit 200, the results of the search are provided to
the navigation apparatus 100, and in response, the navigation
apparatus 100 displays a map of the area around
the facility as part of the navigation processing which
includes route guidance, the present invention is not limited
to these particulars. Various types of navigation
processing such as route search and route guidance
may be implemented in the navigation apparatus 100
based upon the results of a successful search performed
by the voice unit 200.

[0063] While an explanation is given above in reference
to the embodiments on an example in which a
search is performed by using a facility name dictionary
prepared by incorporating the prefecture facility name
dictionary corresponding to the specified prefecture and
a neighboring prefecture facility name dictionary in the
RAM, the present invention is not restricted by these
particulars. A search may be performed by giving the
highest priority to the specified prefecture with neighboring
prefectures assigned with differing priority ranks. In
addition, a search may be started using the facility name
dictionary corresponding to the prefecture with the highest
priority, and the processing may be finished after
completing the search if a correlation value achieving a
level equal to or higher than a specific level is obtained
in referencing the prefecture.
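
The priority-ordered variation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `prioritized_search`, the callable `score_fn` (which stands in for calculating correlation values against one prefecture's dictionary), the threshold and the sample scores are all hypothetical.

```python
def prioritized_search(target, ranked_neighbors, score_fn, threshold):
    """Search the specified prefecture first, then its neighbors in
    priority order, and stop at the first prefecture whose largest
    correlation value reaches the threshold.

    score_fn(prefecture) returns {recognition word: correlation value}.
    Returns (recognition word, prefecture), or None if nothing qualifies.
    """
    for prefecture in [target] + list(ranked_neighbors):
        scores = score_fn(prefecture)
        if not scores:
            continue
        best_word = max(scores, key=scores.get)
        if scores[best_word] >= threshold:
            return best_word, prefecture  # finish without searching further
    return None


# Hypothetical correlation values for the utterance "F Ski Resort".
SCORES = {
    "ABCD Prefecture": {"A Ski Resort": 0.40},
    "EFGH Prefecture": {"F Ski Resort": 0.93},
}

found = prioritized_search(
    "ABCD Prefecture",
    ["EFGH Prefecture"],
    lambda prefecture: SCORES.get(prefecture, {}),
    threshold=0.8,
)
```

Compared with merging the dictionaries beforehand, this ordering avoids scoring lower-priority dictionaries whenever the specified prefecture already yields a sufficient match.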

[0064] While an explanation is given above in reference
to the embodiments on an example in which the
search target is specified through voice recognition, the
present invention is not restricted by these particulars.
It may be adopted when a search target is specified
through an input device such as a keyboard, as well. In
other words, it may be adopted in all modes of a search
executed in units of specific classification blocks instead
of handling all the search targets at once.


Claims 

1. A voice reference apparatus that classifies a plurality
of search targets into a plurality of division
blocks, searches for a search target by first specifying
a division block and then specifying said
search target and enables specification of, at least,
said search target to be made by voice, comprising:

a first storage device in which recognition data
related to search targets corresponding to individual
division blocks are stored;

a second storage device in which division
block-related information indicating one or
more other division blocks related to a given division
block through a specific relationship is
stored;

a recognition data selection device that selects
recognition data corresponding to only a certain
division block and one or more other division
blocks related to said certain division block
specified by said division block-related information
from said first storage device, when said
certain division block has been specified; and

a voice recognition processing device that performs
voice recognition based upon voice recognition
data generated by using said recognition
data selected by said recognition data selection
device and audio data corresponding to
said search target specified by voice.

2. A voice reference apparatus according to claim 1,
wherein:

said plurality of division blocks are public administrative
zones;

said search target is located in one of said public
administrative zones; and

said division block-related information indicates
one or more other public administrative
zones related to a specified public administrative
zone through a specific relationship.

3. A voice reference apparatus according to claim 2,
wherein:

said public administrative zones are each
constituted of a prefecture.

4. A voice reference apparatus according to claim 2,
wherein:

said public administrative zones are each
constituted of a state.

5. A voice reference apparatus according to claim 2,
wherein:

said public administrative zones are each
constituted of a country.

6. A voice reference apparatus according to claim 2,
wherein:

said division block-related information indicates
one or more other public administrative zones
adjacent to a specified public administrative zone.

7. A voice reference apparatus according to claim 6,
wherein:

said recognition data related to said search
target includes information related to a public administrative
zone in which said search target is
located.

8. A voice reference apparatus according to claim 7,
further comprising:

a display control device that implements control
to display details related to results of a search
of said search target on a display device,
wherein:

when implementing control to display the details
related to the results of the search of said
search target, said display control device also
displays on said display device information related
to the public administrative zone in which
said search target is located.

9. A voice recognition navigation apparatus, comprising:

a voice reference apparatus;

a map information storage device that stores
map information; and

a control device that implements control for providing
route guidance based upon, at least, results
of a search performed by said voice reference
apparatus and said map information,
wherein:

said voice reference apparatus, which classifies
a plurality of search targets into a plurality
of division blocks, searches for a search target
by first specifying a division block and then
specifying said search target and enables
specification of, at least, said search target to
be made by voice, comprises:

a first storage device in which recognition data
related to search targets corresponding to individual
division blocks are stored;

a second storage device in which division
block-related information indicating one or
more other division blocks related to a given division
block through a specific relationship is
stored;

a recognition data selection device that selects
recognition data corresponding to only a certain
division block and one or more other division
blocks related to said certain division block
specified by said division block-related information
from said first storage device, when said
certain division block has been specified; and

a voice recognition processing device that performs
voice recognition based upon voice recognition
data generated by using said recognition
data selected by said recognition data selection
device and audio data corresponding to
said search target specified by voice.

10. A voice reference control program for searching for
a search target specified by voice, by first specifying
a division block and then specifying said search target,
comprising:

an instruction for reading recognition data related
to search targets, a plurality of said search
targets being classified into a plurality of division
blocks;

an instruction for reading data related to division
block-related information indicating one or
more other division blocks related to a given
block through a specific relationship;

an instruction for selecting recognition data corresponding
to only a certain division block and
one or more other division blocks related to said
certain division block specified by said division
block-related information when said certain division
block has been specified; and

an instruction for implementing a voice recognition
based upon voice recognition data generated
by using said recognition data that have
been selected and audio data corresponding to
said search target specified by voice.

11. A recording medium that records a voice reference
control program according to claim 10.

12. A data signal comprising a voice reference control
program according to claim 10 and transmitted in a
communication line.